WHAT IS CLAIMED IS:

1. A method of optimizing instructions included in a program being executed, the method comprising:
    collecting information describing a frequency of occurrence of a plurality of cache misses caused by at least one instruction;
    identifying a performance degrading instruction;
    optimizing the program to provide an optimized sequence of instructions, the optimized sequence of instructions comprising at least one prefetch instruction; and
    modifying the program being executed to include the optimized sequence.

2. The method of claim 1, wherein the program comprises a plurality of sequences of instructions.

3. The method of claim 1, wherein the performance degrading instruction contributes to highest frequency of occurrence of the plurality of cache misses.

4. The method of claim 1, wherein the performance degrading instruction contributes to highest degradation in the program performance.

5. The method of claim 1, wherein the at least one instruction is the performance degrading instruction.

6. The method of claim 1, wherein optimizing the program comprises inserting the at least one prefetch instruction prior to the performance degrading instruction.

7. The method of claim 1, wherein the plurality of cache misses are L2/L3 cache misses.

8. The method of claim 1, wherein the optimized sequence is prepared while the program is placed in a suspend mode.

9. The method of claim 8, wherein modifying the program comprises:
    changing the program from the suspend mode to the execution mode.

10. The method of claim 1, wherein optimizing the program comprises:
    receiving information describing a dependency graph for the at least one instruction;
    determining whether a cyclic dependency pattern exists in the dependency graph;
    if the cyclic dependency pattern exists, then computing stride information derived from the cyclic dependency pattern; and
    inserting the prefetch instruction derived from the stride information, the prefetch instruction being inserted into the program prior to the performance degrading instruction.

11. The method of claim 10, wherein the dependency graph is a backward slice from the performance degrading instruction.

12. The method of claim 1, wherein modifying the program comprises:
    storing the optimized sequence; and
    redirecting a sequence of instructions having the performance degrading instruction to include the optimized sequence.
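
By way of illustration only, the storing and redirecting steps recited in claim 12 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the dictionary-based program model, the `jmp` convention, the `_opt` label suffix, and the names `insert_prefetch`, `redirect`, and `resolve` are all hypothetical and do not appear in the claims.

```python
# Hypothetical model: a program maps a label to a list of instruction strings.
# A body consisting of a single "jmp <label>" entry transfers control elsewhere.

def insert_prefetch(path, degrading, prefetch):
    """Return a copy of path with prefetch placed just before degrading."""
    i = path.index(degrading)
    return path[:i] + [prefetch] + path[i:]

def redirect(program, code_cache, label, optimized):
    """Store the optimized sequence, then point the original label at it."""
    cached_label = label + "_opt"              # assumed naming scheme
    code_cache[cached_label] = optimized       # storing step
    program[label] = ["jmp " + cached_label]   # redirecting step
    return cached_label

def resolve(program, code_cache, label):
    """Follow single-jump bodies to find the instructions that would run."""
    table = {**program, **code_cache}
    body = table[label]
    while len(body) == 1 and body[0].startswith("jmp "):
        body = table[body[0][4:]]
    return body

program = {"loop": ["load r1, [r2]", "add r2, r2, 64", "br loop"]}
code_cache = {}
optimized = insert_prefetch(program["loop"], "load r1, [r2]", "prefetch [r2 + 256]")
redirect(program, code_cache, "loop", optimized)
```

In this sketch the original sequence is never edited in place; only its entry point is overwritten with a jump, so the optimized copy can be discarded later by restoring the original body.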

13. A method of optimizing a program comprising a plurality of execution paths, the method comprising:
    collecting information describing a plurality of occurrences of a plurality of cache miss events during a runtime mode of the program;
    identifying a performance degrading execution path in the program;
    modifying the performance degrading execution path to define an optimized execution path, the optimized execution path comprising at least one prefetch instruction;
    storing the optimized execution path; and
    redirecting the performance degrading execution path in the program to include the optimized execution path.

14. The method of claim 13, wherein the plurality of cache miss events are caused by an execution of a plurality of performance degrading instructions.

15. The method of claim 13, wherein identifying the performance degrading path comprises identifying a performance degrading instruction contributing to highest plurality of occurrences of cache miss events.

16. The method of claim 13, wherein the optimized execution path is defined while placing the program in a suspend mode from the runtime mode.

17. The method of claim 16, wherein the optimized execution path is executed on resuming the runtime mode of the program code from the suspend mode.

18. The method of claim 16, wherein redirecting the performance degrading execution path comprises:
    changing the program mode from the suspend mode to the execution mode.

19. The method of claim 13, wherein the performance degrading execution path comprises a performance degrading instruction causing the cache miss event.

20. The method of claim 19, wherein the at least one prefetch instruction is 
inserted prior to the performance degrading instruction. 

21. The method of claim 13, wherein identifying the performance degrading execution path comprises determining whether a cache miss event of the plurality of cache miss events is an L2/L3 cache miss.

22. The method of claim 13, wherein identifying the performance degrading path comprises identifying a performance degrading instruction contributing to highest degradation in the program performance.

23. The method of claim 13, wherein modifying the performance degrading execution path comprises:
    receiving information describing a dependency graph for a performance degrading instruction contributing to highest occurrence of the plurality of cache miss events, the performance degrading instruction being included in the performance degrading execution path;
    determining whether a cyclic dependency pattern exists in the graph;
    if the cyclic dependency pattern exists, then computing stride information derived from the cyclic dependency pattern; and
    inserting the at least one prefetch instruction derived from the stride information, the at least one prefetch instruction being inserted into the optimized execution path prior to the performance degrading instruction.

24. The method of claim 23, wherein the dependency graph is a backward slice from the performance degrading instruction.
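
By way of illustration only, the dependency-graph steps recited in claims 23 and 24 can be sketched as follows. This is a minimal sketch under stated assumptions: the graph is modeled as a map from each register to its single defining operation, only constant additions are treated as forming a stride, and the names `stride_from_cycle`, `make_prefetch`, and the `distance` parameter are hypothetical. The claims do not specify any of these details.

```python
def stride_from_cycle(defs, addr_reg):
    """Walk the backward slice from addr_reg.

    defs maps a register to (op, source_register, constant). Returns the
    accumulated stride if the walk returns to addr_reg through constant
    additions only (a cyclic dependency pattern), otherwise None.
    """
    stride, reg, seen = 0, addr_reg, set()
    while reg in defs and reg not in seen:
        seen.add(reg)
        op, src, const = defs[reg]
        if op != "add":
            return None           # this sketch handles constant additions only
        stride += const
        reg = src
        if reg == addr_reg:
            return stride         # the slice closed on itself: cycle found
    return None                   # slice ended, or cycle excludes addr_reg

def make_prefetch(addr_reg, stride, distance=4):
    """Build a prefetch for the address `distance` iterations ahead (assumed)."""
    return f"prefetch [{addr_reg} + {stride * distance}]"

# r2 is advanced by 64 bytes on every loop iteration, then used as a load address.
defs = {"r2": ("add", "r2", 64)}
stride = stride_from_cycle(defs, "r2")
prefetch = make_prefetch("r2", stride) if stride is not None else None
```

When no cyclic pattern is found the sketch emits no prefetch, which mirrors the conditional wording of the claim ("if the cyclic dependency pattern exists").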

25. A method of optimizing a program, the method comprising:
    receiving information describing a dependency graph for an instruction causing frequent cache misses, the instruction being included in the program;
    determining whether a cyclic dependency pattern exists in the graph;
    if the cyclic dependency pattern exists, then computing stride information derived from the cyclic dependency pattern;
    inserting at least one prefetch instruction derived from the stride information, the at least one prefetch instruction being inserted into the program prior to the instruction causing the frequent cache misses;
    reusing the at least one prefetch instruction in the program for reducing subsequent cache misses; and
    performing said receiving, said determining, said computing, said inserting and said reusing during runtime of the program.

26. A computer-readable medium having a computer program accessible therefrom, wherein the computer program comprises instructions for:
    collecting information describing a frequency of occurrence of a plurality of cache misses caused by at least one instruction;
    identifying a performance degrading instruction;
    optimizing the computer program to provide an optimized sequence of instructions, the optimized sequence of instructions comprising at least one prefetch instruction; and
    modifying the computer program being executed to include the optimized sequence.

27. A computer system comprising:
    a processor;
    a memory coupled to the processor;
    a program comprising instructions, the program being stored in memory, the processor executing instructions to:
        collect information describing a frequency of occurrence of a plurality of cache misses caused by at least one instruction;
        identify a performance degrading instruction;
        optimize the program to provide an optimized sequence of instructions, the optimized sequence of instructions comprising at least one prefetch instruction; and
        modify the program being executed to include the optimized sequence.
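
By way of illustration only, the collecting and identifying steps recited in claims 1, 26, and 27 can be sketched as follows. This is a minimal sketch under stated assumptions: the sample format of (instruction address, cache level) pairs and the name `find_degrading_instruction` are hypothetical, and ranking by raw miss count is only one possible reading of "performance degrading".

```python
from collections import Counter

def find_degrading_instruction(samples, levels=("L2", "L3")):
    """Return (address, miss_count) for the instruction with the most misses.

    samples is an iterable of (instruction_address, cache_level) pairs, as a
    hardware performance monitor might report them (assumed format). Only
    misses at the given cache levels are counted. Returns None if no sample
    qualifies.
    """
    counts = Counter(addr for addr, level in samples if level in levels)
    if not counts:
        return None
    return counts.most_common(1)[0]

samples = [
    (0x4010, "L2"), (0x4010, "L3"), (0x4024, "L1"),
    (0x4010, "L2"), (0x4038, "L3"),
]
hot = find_degrading_instruction(samples)
```

Restricting the count to L2/L3 misses follows the dependent claims; a different cost model, such as weighting each miss by its latency, would fit the same interface.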



815514 v2 

Client Reference: P6823 






